
    Bandit Algorithms for Tree Search

    Bandit-based methods for tree search have recently gained popularity when applied to huge trees, e.g. in the game of Go (Gelly et al., 2006). The UCT algorithm (Kocsis and Szepesvari, 2006), a tree search method based on Upper Confidence Bounds (UCB) (Auer et al., 2002), is believed to adapt locally to the effective smoothness of the tree. However, we show that UCT is too "optimistic" in some cases, leading to a regret O(exp(exp(D))) where D is the depth of the tree. We propose alternative bandit algorithms for tree search. First, a modification of UCT using a confidence sequence that scales exponentially with the horizon depth is proven to have a regret O(2^D \sqrt{n}), but does not adapt to possible smoothness in the tree. We then analyze Flat-UCB performed on the leaves and provide a finite regret bound with high probability. Next, we introduce a UCB-based Bandit Algorithm for Smooth Trees which takes the actual smoothness of the rewards into account to perform efficient "cuts" of sub-optimal branches with high confidence. Finally, we present an incremental tree search version which applies when the full tree is too big (possibly infinite) to be entirely represented, and show that with high probability, essentially only the optimal branch is developed indefinitely. We illustrate these methods on the global optimization of a Lipschitz function given noisy data.
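    The Flat-UCB idea mentioned above can be sketched as follows: treat each leaf of the tree as an arm of a multi-armed bandit and run UCB1 (Auer et al., 2002) directly on the leaves. The leaf means, noise model, and horizon below are toy choices for illustration, not values from the paper.

```python
import math
import random

def ucb1_flat(leaf_means, n_rounds, seed=0):
    """Flat-UCB sketch: play UCB1 directly on the leaves of the tree.
    `leaf_means` are the (hypothetical) expected rewards of the leaves."""
    rng = random.Random(seed)
    K = len(leaf_means)
    counts = [0] * K        # number of pulls per leaf
    sums = [0.0] * K        # cumulative reward per leaf
    for t in range(1, n_rounds + 1):
        if t <= K:          # play each leaf once to initialize
            arm = t - 1
        else:               # UCB1 index: empirical mean + sqrt(2 ln t / n_i)
            arm = max(range(K), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = leaf_means[arm] + rng.uniform(-0.1, 0.1)   # noisy reward
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1_flat([0.2, 0.5, 0.8], 3000)
best = max(range(3), key=lambda i: counts[i])   # leaf pulled most often
```

    After a few thousand rounds the pull counts concentrate on the best leaf, which is the behavior the finite regret bound above formalizes.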

    Sensitivity analysis in HMMs with application to likelihood maximization

    This paper considers a sensitivity analysis in Hidden Markov Models with continuous state and observation spaces. We propose an Infinitesimal Perturbation Analysis (IPA) of the filtering distribution with respect to some parameters of the model. We describe a methodology for using any algorithm that estimates the filtering density, such as Sequential Monte Carlo methods, to design an algorithm that estimates its gradient. The resulting IPA estimator is proven to be asymptotically unbiased and consistent, and has computational complexity linear in the number of particles. We consider an application of this analysis to the problem of identifying unknown parameters of the model given a sequence of observations. We derive an IPA estimator for the gradient of the log-likelihood, which may be used in a gradient method for the purpose of likelihood maximization. We illustrate the method with several numerical experiments.
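    The core IPA idea used above is the pathwise derivative: when the sample can be written as a smooth function of the parameter (here the toy reparameterization X = theta + Z with Z standard Gaussian, which is an illustrative assumption, not the paper's HMM), the gradient of an expectation is the expectation of the per-sample derivative.

```python
import random

def ipa_gradient(theta, f_prime, n=10000, seed=0):
    """Infinitesimal Perturbation Analysis (pathwise) sketch:
    for X = theta + Z with Z ~ N(0, 1), d/dtheta E[f(X)] = E[f'(X)],
    since dX/dtheta = 1 along each sample path.
    `f_prime` is a hypothetical test function, not from the paper."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = theta + z
        total += f_prime(x)    # pathwise derivative f'(x) * dx/dtheta
    return total / n

# d/dtheta E[(theta + Z)^2] = 2 * theta, so at theta = 1.5 we expect ~3.0
g = ipa_gradient(1.5, lambda x: 2.0 * x)
```

    The particle-filter version in the paper propagates such per-particle derivatives alongside the particles, which is why the cost stays linear in the number of particles.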

    Particle filter-based policy gradient for POMDPs

    Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the belief state given past observations. We consider a policy gradient approach for parameterized policy optimization. For that purpose, we investigate sensitivity analysis of the performance measure with respect to the parameters of the policy, focusing on Finite Difference (FD) techniques. We show that the naive FD estimator is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method which overcomes this problem and establish its consistency.
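    A minimal sketch of the finite-difference setup, assuming a toy scalar performance measure: re-using the same random numbers for the +h and -h evaluations keeps the two sample paths coupled, which is the basic coupling idea behind variance reduction in FD estimators (the paper's method for handling resampling non-smoothness is considerably more involved). `J(theta, rng)` is a hypothetical noisy performance measure, not the POMDP of the paper.

```python
import random

def fd_gradient(J, theta, h=1e-2, n=400, common_seed=True):
    """Central finite-difference gradient with common random numbers:
    each pair of evaluations J(theta + h) and J(theta - h) shares a seed,
    so the noise largely cancels in the difference."""
    total = 0.0
    for k in range(n):
        seed_p = k
        seed_m = k if common_seed else n + k   # decouple seeds if requested
        jp = J(theta + h, random.Random(seed_p))
        jm = J(theta - h, random.Random(seed_m))
        total += (jp - jm) / (2.0 * h)
    return total / n

# toy performance measure J(theta) = E[(theta + noise)^2], gradient 2 * theta
def J(theta, rng):
    return (theta + rng.gauss(0.0, 1.0)) ** 2

g = fd_gradient(J, 1.0)   # should be close to 2.0
```

    Setting `common_seed=False` decouples the two evaluations and makes the estimator's variance blow up as h shrinks, which illustrates (in a much simpler setting) the variance problem the paper addresses.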

    Optimal Policies Search for Sensor Management: Application to the AESA Radar

    This report introduces a new approach to solve sensor management problems. Classically, sensor management problems are formalized as Partially Observed Markov Decision Processes (POMDPs). Our original approach consists in deriving the optimal parameterized policy based on stochastic gradient estimation. Two different techniques, named Infinitesimal Perturbation Analysis (IPA) and Likelihood Ratio (LR), can be used to address such a problem. This report discusses how these methods can be used for gradient estimation in the context of sensor management. The effectiveness of this general framework is illustrated by the management of an Active Electronically Scanned Array (AESA) radar.
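    The Likelihood Ratio (score-function) technique mentioned above can be sketched on a toy Gaussian policy; the reward function and parameterization below are illustrative assumptions, not the radar model from the report.

```python
import random

def lr_gradient(theta, f, n=20000, seed=0):
    """Likelihood Ratio (score-function) gradient sketch:
    for X ~ N(theta, 1), d/dtheta E[f(X)] = E[f(X) * (X - theta)],
    since the score of the Gaussian density w.r.t. its mean is (x - theta).
    `f` is a hypothetical reward function."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        total += f(x) * (x - theta)    # reward weighted by the score
    return total / n

# d/dtheta E[(theta + Z)^2] = 2 * theta, so at theta = 1.0 we expect ~2.0
g = lr_gradient(1.0, lambda x: x * x)
```

    Unlike IPA, the LR estimator only needs the density of the policy to be differentiable, not the reward itself, which is why the two techniques complement each other in this framework.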

    Optimal Policies Search for Sensor Management

    This paper introduces a new approach to solve sensor management problems. Classically, sensor management problems can be well formalized as Partially Observed Markov Decision Processes (POMDPs). The original approach developed here consists in deriving the optimal parameterized policy based on a stochastic gradient estimation. We assume in this work that it is possible to learn the optimal policy off-line (in simulation) using models of the environment and of the sensor(s). The learned policy can then be used to manage the sensor(s). In order to estimate the gradient in a stochastic context, we introduce a new method based on Infinitesimal Perturbation Analysis (IPA). The effectiveness of this general framework is illustrated by the management of an Electronically Scanned Array Radar. First simulation results are presented.

    Numerical methods for sensitivity analysis of Feynman-Kac models

    The aim of this work is to provide efficient numerical methods to estimate the gradient of a Feynman-Kac flow with respect to a parameter of the model. The underlying idea is to view a Feynman-Kac flow as an expectation of a product of potential functions along a canonical Markov chain, and to use the usual techniques of gradient estimation in Markov chains. Combining this idea with interacting particle methods enables us to obtain two new algorithms that provide tight estimates of the sensitivity of a Feynman-Kac flow. Each algorithm has computational complexity linear in the number of particles and is demonstrated to be asymptotically consistent. We also carefully analyze the differences between these new algorithms and existing ones. We provide numerical experiments to assess the practical efficiency of the proposed methods and explain how to use them to solve a parameter estimation problem in Hidden Markov Models. These algorithms outperform the existing ones in terms of the trade-off between computational complexity and estimation quality.
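    The underlying idea above can be sketched with plain Monte Carlo (no interacting particles): write the flow as E[f(X_n) * prod_k G_theta(X_k)] along a Markov chain and differentiate the product of potentials, which brings the score term sum_k d/dtheta log G_theta(X_k) inside the expectation. The AR(1) chain, the potential G_theta(x) = exp(-theta x^2), and f(x) = x^2 are toy choices, not the models from the paper.

```python
import math
import random

def fk_flow(theta, n_steps=3, n_samples=2000, seed=0):
    """Returns Monte Carlo estimates of a toy Feynman-Kac flow
    E[f(X_n) * prod_k G_theta(X_k)] and of its gradient in theta."""
    rng = random.Random(seed)
    val, grad = 0.0, 0.0
    for _ in range(n_samples):
        x, log_prod, score = 0.0, 0.0, 0.0
        for _ in range(n_steps):
            x = 0.5 * x + rng.gauss(0.0, 1.0)   # canonical Markov chain step
            log_prod += -theta * x * x           # log G_theta(x)
            score += -x * x                      # d/dtheta log G_theta(x)
        w = math.exp(log_prod)                   # product of potentials
        val += x * x * w                         # f(X_n) = X_n^2
        grad += x * x * w * score                # gradient contribution
    return val / n_samples, grad / n_samples

val, grad = fk_flow(0.10)
```

    Since the chain itself does not depend on theta here, the gradient estimate can be checked against a finite difference of the flow value computed with the same seed.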

    A Dynamic Programming Approach to Viability Problems

    Viability theory considers the problem of maintaining a system within a set of viability constraints. The main tool for solving viability problems is the construction of the viability kernel, defined as the set of initial states from which there exists a trajectory that remains in the set of constraints indefinitely. The theory is very elegant and appears naturally in many applications. Unfortunately, the current numerical approaches suffer from low computational efficiency, which limits the potential range of applications of this domain. In this paper we show that the viability kernel is the zero-level set of a related dynamic programming problem, which opens promising research directions for numerical approximation of the viability kernel using tools from approximate dynamic programming. We illustrate the approach using k-nearest neighbors on a toy problem in two dimensions and on a complex dynamical model of the anaerobic digestion process in four dimensions.
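    The fixed-point character of the viability kernel can be sketched on a discrete toy system (an illustrative choice, not the paper's model): iterate K_{n+1} = { x in K_n : exists u with f(x, u) in K_n } until stable. In the paper's dynamic programming view, this stable set is exactly the zero-level set of the associated value function.

```python
def viability_kernel(states, controls, f):
    """Discrete viability-kernel iteration sketch: repeatedly remove
    states from which every control leads outside the current set."""
    kernel = set(states)
    while True:
        nxt = {x for x in kernel
               if any(f(x, u) in kernel for u in controls)}
        if nxt == kernel:        # fixed point reached: this is the kernel
            return kernel
        kernel = nxt

# toy unstable system x' = 2x + u with controls u in {-1, 0, 1},
# constraint set {-3, ..., 3}; only states near 0 can be held forever
K = viability_kernel(range(-3, 4), (-1, 0, 1), lambda x, u: 2 * x + u)
```

    Here the doubling dynamics can only be counteracted close to the origin (e.g. from x = 1, choosing u = -1 gives 2*1 - 1 = 1, a self-loop), so the iteration shrinks the constraint set down to {-1, 0, 1}.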